
Skip synchronized(unsafeTags) on owner-thread tag writes#11082

Draft
bm1549 wants to merge 5 commits into master from
brian.marks/thread-owned-tags

Conversation

@bm1549
Contributor

@bm1549 bm1549 commented Apr 10, 2026

What Does This Do

Optimizes span tag writes by skipping synchronized blocks when the creating (owner) thread writes tags — the ~95% common case. Introduces a withLock() API on TagMap for compound operations that need multi-operation atomicity.

Reworked based on reviewer feedback to address correctness issues and encapsulate all locking inside TagMap:

  • TagMap.withLock(Runnable/Supplier) replaces all synchronized(unsafeTags) in DDSpanContext — locking is now fully encapsulated in TagMap, not spread across callers
  • OptimizedTagMap gains ownership checks on all previously-unprotected methods (size(), isEmpty(), containsValue(), freeze(), isFrozen(), toString(), immutableCopy())
  • revokeOwnership() helper avoids redundant volatile writes on the cache line after transition
  • withLock() revokes ownership before running the operation, closing the TOCTOU race
  • LegacyTagMap gets self-synchronization on all key HashMap methods
  • Long-running spans disable the optimization entirely (writer thread reads tags on unfinished spans)
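
The ownership rules listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the class `OwnedTagMapSketch`, its `HashMap` backing store, and the `runOnOtherThread` helper are hypothetical. Only `withLock`, `revokeOwnership`, and the owner/shared distinction are taken from the description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for the ownership model described above; not the real OptimizedTagMap. */
final class OwnedTagMapSketch {
  private final Map<String, Object> tags = new HashMap<>();
  // Creating thread while the map is thread-owned; null once it has become shared.
  private volatile Thread ownerThread = Thread.currentThread();

  Object set(String key, Object value) {
    if (ownerThread == Thread.currentThread()) {
      // Owner fast path: no monitor. A concurrent revocation can still race with this
      // write; this HashMap-backed sketch does not try to make that window safe.
      return tags.put(key, value);
    }
    synchronized (this) {
      revokeOwnership();
      return tags.put(key, value);
    }
  }

  int size() {
    if (ownerThread == Thread.currentThread()) {
      return tags.size();
    }
    synchronized (this) {
      revokeOwnership();
      return tags.size();
    }
  }

  /** Compound operations always take the monitor and revoke ownership first. */
  <T> T withLock(Supplier<T> op) {
    synchronized (this) {
      revokeOwnership();
      return op.get();
    }
  }

  boolean isShared() {
    return ownerThread == null;
  }

  private void revokeOwnership() {
    if (ownerThread != null) { // skip the volatile write when already shared
      ownerThread = null;
    }
  }

  /** Hypothetical helper: run a task on another thread and wait for it to finish. */
  static void runOnOtherThread(Runnable task) {
    Thread t = new Thread(task);
    t.start();
    try {
      t.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    OwnedTagMapSketch map = new OwnedTagMapSketch();
    map.set("http.method", "GET");                          // owner thread, lock-free
    runOnOtherThread(() -> map.set("peer.service", "db"));  // non-owner: locks and revokes
    int count = map.withLock(() -> {
      map.set("a", 1);                                      // reentrant: monitor already held
      map.set("b", 2);
      return map.size();
    });
    System.out.println("shared=" + map.isShared() + " size=" + count);
  }
}
```

Once any non-owner thread or any `withLock` call has touched the map, the owner check fails for every thread, so all later operations serialize on the monitor.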

Motivation

The synchronized(unsafeTags) blocks in DDSpanContext add ~15-30ns per uncontended monitor enter/exit. With 15-30 tags per span across millions of spans, this overhead is meaningful. Since ~95% of tag writes happen on the creating thread, the owner-thread fast path avoids this cost.
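
As a rough back-of-the-envelope check of those figures, the per-span cost can be bounded by multiplying the quoted ranges. The sketch below assumes exactly one uncontended monitor enter/exit per tag write; the class name and the one-million-span workload are hypothetical.

```java
final class LockOverheadEstimate {
  /** Monitor cost per span, assuming one uncontended enter/exit per tag write. */
  static long perSpanNanos(int tagsPerSpan, int nanosPerLock) {
    return (long) tagsPerSpan * nanosPerLock;
  }

  public static void main(String[] args) {
    long low = perSpanNanos(15, 15);   // lower bound from the quoted ranges
    long high = perSpanNanos(30, 30);  // upper bound from the quoted ranges
    long spans = 1_000_000L;           // hypothetical workload size
    System.out.printf("per span: %d-%d ns%n", low, high);
    System.out.printf("per %d spans: %d-%d ms%n",
        spans, low * spans / 1_000_000, high * spans / 1_000_000);
  }
}
```

Under these assumptions the monitor cost is 225 to 900 ns per span, or roughly 225 to 900 ms of CPU time per million spans.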

Additional Notes

  • tag: ai generated
  • The TOCTOU race window during transition produces at most a lost update (JVM reference writes are atomic per JLS 17.7), not structural corruption
  • Compound operations (serialization, bulk writes, sampling priority) use withLock() which always acquires the monitor
  • After transitionToShared(), all operations go through synchronized(this) — compound atomicity is preserved

Contributor Checklist

  • Format the title according to the contribution guidelines
  • Assign the type: and (comp: or inst:) labels
  • Avoid using close, fix, or any linking keywords

Spans are almost always written by a single thread, so the lock on every
setTag/setMetric call is uncontended overhead. This adds a volatile
tagWriteState check: if the current thread is the span's creating thread
(STATE_OWNER), tag writes skip the lock entirely. Non-owner threads and
post-finish writes take the lock and sticky-transition to STATE_SHARED.

Long-running spans disable the optimization at construction since the
writer thread may read tags on unfinished spans.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bm1549 bm1549 added type: enhancement Enhancements and improvements comp: core Tracer core tag: ai generated Largely based on code generated by an AI or LLM labels Apr 10, 2026
@bm1549 bm1549 changed the title Skip synchronized(unsafeTags) on owner-thread tag writes WIP - DO NOT REVIEW: Skip synchronized(unsafeTags) on owner-thread tag writes Apr 10, 2026
@bm1549 bm1549 marked this pull request as ready for review April 10, 2026 19:05
@bm1549 bm1549 requested a review from a team as a code owner April 10, 2026 19:05
@bm1549 bm1549 requested a review from mhlidd April 10, 2026 19:05
@pr-commenter

pr-commenter bot commented Apr 10, 2026

Benchmarks

⚠️ Warning: Baseline build not found for merge-base commit. Comparing against the latest commit on master instead.

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/thread-owned-tags
git_commit_date 1776186597 1776186993
git_commit_sha f064e18 8d5c038
release_version 1.62.0-SNAPSHOT~f064e18a6c 1.62.0-SNAPSHOT~8d5c038a1d
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1776188732 1776188732
ci_job_id 1594656898 1594656898
ci_pipeline_id 107655738 107655738
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-msrslgcs 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-msrslgcs 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 59 metrics, 12 unstable metrics.

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~8d5c038a1d, baseline=1.62.0-SNAPSHOT~f064e18a6c

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.061 s) : 0, 1060746
Total [baseline] (11.04 s) : 0, 11039555
Agent [candidate] (1.062 s) : 0, 1061936
Total [candidate] (11.065 s) : 0, 11064679
section appsec
Agent [baseline] (1.25 s) : 0, 1249580
Total [baseline] (11.11 s) : 0, 11110236
Agent [candidate] (1.253 s) : 0, 1252668
Total [candidate] (11.081 s) : 0, 11080803
section iast
Agent [baseline] (1.231 s) : 0, 1231355
Total [baseline] (11.391 s) : 0, 11390969
Agent [candidate] (1.222 s) : 0, 1222292
Total [candidate] (11.348 s) : 0, 11348410
section profiling
Agent [baseline] (1.192 s) : 0, 1192401
Total [baseline] (11.012 s) : 0, 11012053
Agent [candidate] (1.197 s) : 0, 1197363
Total [candidate] (11.047 s) : 0, 11046759
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.061 s -
Agent appsec 1.25 s 188.833 ms (17.8%)
Agent iast 1.231 s 170.609 ms (16.1%)
Agent profiling 1.192 s 131.655 ms (12.4%)
Total tracing 11.04 s -
Total appsec 11.11 s 70.681 ms (0.6%)
Total iast 11.391 s 351.414 ms (3.2%)
Total profiling 11.012 s -27.502 ms (-0.2%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.062 s -
Agent appsec 1.253 s 190.732 ms (18.0%)
Agent iast 1.222 s 160.356 ms (15.1%)
Agent profiling 1.197 s 135.427 ms (12.8%)
Total tracing 11.065 s -
Total appsec 11.081 s 16.124 ms (0.1%)
Total iast 11.348 s 283.731 ms (2.6%)
Total profiling 11.047 s -17.919 ms (-0.2%)
gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~8d5c038a1d, baseline=1.62.0-SNAPSHOT~f064e18a6c

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.233 ms) : 0, 1233
crashtracking [candidate] (1.224 ms) : 0, 1224
BytebuddyAgent [baseline] (633.695 ms) : 0, 633695
BytebuddyAgent [candidate] (635.536 ms) : 0, 635536
AgentMeter [baseline] (29.459 ms) : 0, 29459
AgentMeter [candidate] (29.604 ms) : 0, 29604
GlobalTracer [baseline] (249.927 ms) : 0, 249927
GlobalTracer [candidate] (250.035 ms) : 0, 250035
AppSec [baseline] (32.688 ms) : 0, 32688
AppSec [candidate] (32.09 ms) : 0, 32090
Debugger [baseline] (60.457 ms) : 0, 60457
Debugger [candidate] (60.617 ms) : 0, 60617
Remote Config [baseline] (596.675 µs) : 0, 597
Remote Config [candidate] (600.657 µs) : 0, 601
Telemetry [baseline] (8.134 ms) : 0, 8134
Telemetry [candidate] (8.239 ms) : 0, 8239
Flare Poller [baseline] (8.329 ms) : 0, 8329
Flare Poller [candidate] (7.717 ms) : 0, 7717
section appsec
crashtracking [baseline] (1.233 ms) : 0, 1233
crashtracking [candidate] (1.217 ms) : 0, 1217
BytebuddyAgent [baseline] (662.827 ms) : 0, 662827
BytebuddyAgent [candidate] (664.215 ms) : 0, 664215
AgentMeter [baseline] (12.071 ms) : 0, 12071
AgentMeter [candidate] (12.139 ms) : 0, 12139
GlobalTracer [baseline] (248.859 ms) : 0, 248859
GlobalTracer [candidate] (250.152 ms) : 0, 250152
IAST [baseline] (24.55 ms) : 0, 24550
IAST [candidate] (24.685 ms) : 0, 24685
AppSec [baseline] (185.074 ms) : 0, 185074
AppSec [candidate] (185.043 ms) : 0, 185043
Debugger [baseline] (66.02 ms) : 0, 66020
Debugger [candidate] (65.976 ms) : 0, 65976
Remote Config [baseline] (617.702 µs) : 0, 618
Remote Config [candidate] (616.528 µs) : 0, 617
Telemetry [baseline] (8.341 ms) : 0, 8341
Telemetry [candidate] (8.588 ms) : 0, 8588
Flare Poller [baseline] (3.525 ms) : 0, 3525
Flare Poller [candidate] (3.578 ms) : 0, 3578
section iast
crashtracking [baseline] (1.231 ms) : 0, 1231
crashtracking [candidate] (1.208 ms) : 0, 1208
BytebuddyAgent [baseline] (804.843 ms) : 0, 804843
BytebuddyAgent [candidate] (799.163 ms) : 0, 799163
AgentMeter [baseline] (11.485 ms) : 0, 11485
AgentMeter [candidate] (11.312 ms) : 0, 11312
GlobalTracer [baseline] (240.569 ms) : 0, 240569
GlobalTracer [candidate] (239.09 ms) : 0, 239090
IAST [baseline] (26.858 ms) : 0, 26858
IAST [candidate] (25.849 ms) : 0, 25849
AppSec [baseline] (29.086 ms) : 0, 29086
AppSec [candidate] (31.154 ms) : 0, 31154
Debugger [baseline] (63.114 ms) : 0, 63114
Debugger [candidate] (60.912 ms) : 0, 60912
Remote Config [baseline] (2.294 ms) : 0, 2294
Remote Config [candidate] (522.524 µs) : 0, 523
Telemetry [baseline] (12.117 ms) : 0, 12117
Telemetry [candidate] (13.059 ms) : 0, 13059
Flare Poller [baseline] (3.538 ms) : 0, 3538
Flare Poller [candidate] (3.488 ms) : 0, 3488
section profiling
crashtracking [baseline] (1.192 ms) : 0, 1192
crashtracking [candidate] (1.184 ms) : 0, 1184
BytebuddyAgent [baseline] (696.464 ms) : 0, 696464
BytebuddyAgent [candidate] (697.789 ms) : 0, 697789
AgentMeter [baseline] (9.158 ms) : 0, 9158
AgentMeter [candidate] (9.22 ms) : 0, 9220
GlobalTracer [baseline] (208.532 ms) : 0, 208532
GlobalTracer [candidate] (209.882 ms) : 0, 209882
AppSec [baseline] (33.092 ms) : 0, 33092
AppSec [candidate] (33.14 ms) : 0, 33140
Debugger [baseline] (66.066 ms) : 0, 66066
Debugger [candidate] (66.404 ms) : 0, 66404
Remote Config [baseline] (586.264 µs) : 0, 586
Remote Config [candidate] (575.457 µs) : 0, 575
Telemetry [baseline] (7.847 ms) : 0, 7847
Telemetry [candidate] (8.029 ms) : 0, 8029
Flare Poller [baseline] (3.567 ms) : 0, 3567
Flare Poller [candidate] (3.623 ms) : 0, 3623
ProfilingAgent [baseline] (94.073 ms) : 0, 94073
ProfilingAgent [candidate] (95.771 ms) : 0, 95771
Profiling [baseline] (94.676 ms) : 0, 94676
Profiling [candidate] (96.348 ms) : 0, 96348
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~8d5c038a1d, baseline=1.62.0-SNAPSHOT~f064e18a6c

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.065 s) : 0, 1064881
Total [baseline] (8.864 s) : 0, 8864395
Agent [candidate] (1.057 s) : 0, 1056624
Total [candidate] (8.828 s) : 0, 8828009
section iast
Agent [baseline] (1.223 s) : 0, 1222942
Total [baseline] (9.54 s) : 0, 9540121
Agent [candidate] (1.228 s) : 0, 1227861
Total [candidate] (9.573 s) : 0, 9573282
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.065 s -
Agent iast 1.223 s 158.061 ms (14.8%)
Total tracing 8.864 s -
Total iast 9.54 s 675.725 ms (7.6%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.057 s -
Agent iast 1.228 s 171.238 ms (16.2%)
Total tracing 8.828 s -
Total iast 9.573 s 745.273 ms (8.4%)
gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~8d5c038a1d, baseline=1.62.0-SNAPSHOT~f064e18a6c

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.235 ms) : 0, 1235
crashtracking [candidate] (1.228 ms) : 0, 1228
BytebuddyAgent [baseline] (638.11 ms) : 0, 638110
BytebuddyAgent [candidate] (633.611 ms) : 0, 633611
AgentMeter [baseline] (29.605 ms) : 0, 29605
AgentMeter [candidate] (29.419 ms) : 0, 29419
GlobalTracer [baseline] (249.999 ms) : 0, 249999
GlobalTracer [candidate] (248.799 ms) : 0, 248799
AppSec [baseline] (32.671 ms) : 0, 32671
AppSec [candidate] (32.0 ms) : 0, 32000
Debugger [baseline] (59.781 ms) : 0, 59781
Debugger [candidate] (59.162 ms) : 0, 59162
Remote Config [baseline] (602.581 µs) : 0, 603
Remote Config [candidate] (625.305 µs) : 0, 625
Telemetry [baseline] (8.153 ms) : 0, 8153
Telemetry [candidate] (8.092 ms) : 0, 8092
Flare Poller [baseline] (8.359 ms) : 0, 8359
Flare Poller [candidate] (7.487 ms) : 0, 7487
section iast
crashtracking [baseline] (1.227 ms) : 0, 1227
crashtracking [candidate] (1.237 ms) : 0, 1237
BytebuddyAgent [baseline] (800.33 ms) : 0, 800330
BytebuddyAgent [candidate] (805.087 ms) : 0, 805087
AgentMeter [baseline] (11.374 ms) : 0, 11374
AgentMeter [candidate] (11.608 ms) : 0, 11608
GlobalTracer [baseline] (239.024 ms) : 0, 239024
GlobalTracer [candidate] (239.509 ms) : 0, 239509
AppSec [baseline] (32.129 ms) : 0, 32129
AppSec [candidate] (30.134 ms) : 0, 30134
Debugger [baseline] (61.718 ms) : 0, 61718
Debugger [candidate] (63.053 ms) : 0, 63053
Remote Config [baseline] (533.982 µs) : 0, 534
Remote Config [candidate] (536.316 µs) : 0, 536
Telemetry [baseline] (11.167 ms) : 0, 11167
Telemetry [candidate] (11.08 ms) : 0, 11080
Flare Poller [baseline] (3.441 ms) : 0, 3441
Flare Poller [candidate] (3.452 ms) : 0, 3452
IAST [baseline] (25.846 ms) : 0, 25846
IAST [candidate] (25.927 ms) : 0, 25927

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/thread-owned-tags
git_commit_date 1776186597 1776186993
git_commit_sha f064e18 8d5c038
release_version 1.62.0-SNAPSHOT~f064e18a6c 1.62.0-SNAPSHOT~8d5c038a1d
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1776189296 1776189296
ci_job_id 1594656901 1594656901
ci_pipeline_id 107655738 107655738
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-2-klyjto3z 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-2-klyjto3z 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 1 performance regression! Performance is the same for 19 metrics, 16 unstable metrics.

scenario: scenario:load:petclinic:no_agent:high_load
  • Δ mean agg_http_req_duration_p50: worse, [+0.356ms; +1.845ms] or [+2.100%; +10.876%]
  • Δ mean agg_http_req_duration_p95: unstable, [-0.415ms; +2.588ms] or [-1.451%; +9.052%]
  • Δ mean throughput: unstable, [-44.797op/s; +16.735op/s] or [-16.745%; +6.255%]
  • candidate mean agg_http_req_duration_p50: 18.066ms
  • candidate mean agg_http_req_duration_p95: 29.678ms
  • candidate mean throughput: 253.500op/s
  • baseline mean agg_http_req_duration_p50: 16.966ms
  • baseline mean agg_http_req_duration_p95: 28.592ms
  • baseline mean throughput: 267.531op/s
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~8d5c038a1d, baseline=1.62.0-SNAPSHOT~f064e18a6c
    dateFormat X
    axisFormat %s
section baseline
no_agent (17.443 ms) : 17268, 17617
.   : milestone, 17443,
appsec (18.813 ms) : 18623, 19002
.   : milestone, 18813,
code_origins (18.095 ms) : 17916, 18274
.   : milestone, 18095,
iast (18.155 ms) : 17974, 18336
.   : milestone, 18155,
profiling (18.425 ms) : 18245, 18605
.   : milestone, 18425,
tracing (17.805 ms) : 17631, 17980
.   : milestone, 17805,
section candidate
no_agent (18.409 ms) : 18224, 18593
.   : milestone, 18409,
appsec (18.918 ms) : 18727, 19109
.   : milestone, 18918,
code_origins (18.051 ms) : 17877, 18226
.   : milestone, 18051,
iast (18.108 ms) : 17928, 18287
.   : milestone, 18108,
profiling (18.507 ms) : 18323, 18691
.   : milestone, 18507,
tracing (17.905 ms) : 17728, 18082
.   : milestone, 17905,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 17.443 ms [17.268 ms, 17.617 ms] -
appsec 18.813 ms [18.623 ms, 19.002 ms] 1.37 ms (7.9%)
code_origins 18.095 ms [17.916 ms, 18.274 ms] 652.216 µs (3.7%)
iast 18.155 ms [17.974 ms, 18.336 ms] 712.393 µs (4.1%)
profiling 18.425 ms [18.245 ms, 18.605 ms] 982.643 µs (5.6%)
tracing 17.805 ms [17.631 ms, 17.98 ms] 362.584 µs (2.1%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 18.409 ms [18.224 ms, 18.593 ms] -
appsec 18.918 ms [18.727 ms, 19.109 ms] 509.419 µs (2.8%)
code_origins 18.051 ms [17.877 ms, 18.226 ms] -357.782 µs (-1.9%)
iast 18.108 ms [17.928 ms, 18.287 ms] -301.153 µs (-1.6%)
profiling 18.507 ms [18.323 ms, 18.691 ms] 98.076 µs (0.5%)
tracing 17.905 ms [17.728 ms, 18.082 ms] -503.996 µs (-2.7%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~8d5c038a1d, baseline=1.62.0-SNAPSHOT~f064e18a6c
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.226 ms) : 1215, 1238
.   : milestone, 1226,
iast (3.396 ms) : 3350, 3441
.   : milestone, 3396,
iast_FULL (6.086 ms) : 6024, 6148
.   : milestone, 6086,
iast_GLOBAL (3.732 ms) : 3669, 3795
.   : milestone, 3732,
profiling (2.243 ms) : 2222, 2265
.   : milestone, 2243,
tracing (1.97 ms) : 1953, 1986
.   : milestone, 1970,
section candidate
no_agent (1.228 ms) : 1216, 1239
.   : milestone, 1228,
iast (3.361 ms) : 3312, 3409
.   : milestone, 3361,
iast_FULL (5.978 ms) : 5917, 6039
.   : milestone, 5978,
iast_GLOBAL (3.658 ms) : 3598, 3718
.   : milestone, 3658,
profiling (2.346 ms) : 2325, 2367
.   : milestone, 2346,
tracing (1.847 ms) : 1832, 1862
.   : milestone, 1847,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.226 ms [1.215 ms, 1.238 ms] -
iast 3.396 ms [3.35 ms, 3.441 ms] 2.169 ms (176.9%)
iast_FULL 6.086 ms [6.024 ms, 6.148 ms] 4.859 ms (396.3%)
iast_GLOBAL 3.732 ms [3.669 ms, 3.795 ms] 2.506 ms (204.4%)
profiling 2.243 ms [2.222 ms, 2.265 ms] 1.017 ms (83.0%)
tracing 1.97 ms [1.953 ms, 1.986 ms] 743.516 µs (60.6%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.228 ms [1.216 ms, 1.239 ms] -
iast 3.361 ms [3.312 ms, 3.409 ms] 2.133 ms (173.7%)
iast_FULL 5.978 ms [5.917 ms, 6.039 ms] 4.75 ms (386.9%)
iast_GLOBAL 3.658 ms [3.598 ms, 3.718 ms] 2.43 ms (197.9%)
profiling 2.346 ms [2.325 ms, 2.367 ms] 1.118 ms (91.1%)
tracing 1.847 ms [1.832 ms, 1.862 ms] 619.229 µs (50.4%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/thread-owned-tags
git_commit_date 1776186597 1776186993
git_commit_sha f064e18 8d5c038
release_version 1.62.0-SNAPSHOT~f064e18a6c 1.62.0-SNAPSHOT~8d5c038a1d
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1776189023 1776189023
ci_job_id 1594656903 1594656903
ci_pipeline_id 107655738 107655738
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-3-scnuo999 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-3-scnuo999 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metric.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~8d5c038a1d, baseline=1.62.0-SNAPSHOT~f064e18a6c
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.265 s) : 15265000, 15265000
.   : milestone, 15265000,
appsec (14.687 s) : 14687000, 14687000
.   : milestone, 14687000,
iast (18.422 s) : 18422000, 18422000
.   : milestone, 18422000,
iast_GLOBAL (18.194 s) : 18194000, 18194000
.   : milestone, 18194000,
profiling (15.462 s) : 15462000, 15462000
.   : milestone, 15462000,
tracing (14.752 s) : 14752000, 14752000
.   : milestone, 14752000,
section candidate
no_agent (15.584 s) : 15584000, 15584000
.   : milestone, 15584000,
appsec (14.739 s) : 14739000, 14739000
.   : milestone, 14739000,
iast (18.429 s) : 18429000, 18429000
.   : milestone, 18429000,
iast_GLOBAL (18.006 s) : 18006000, 18006000
.   : milestone, 18006000,
profiling (15.03 s) : 15030000, 15030000
.   : milestone, 15030000,
tracing (14.867 s) : 14867000, 14867000
.   : milestone, 14867000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.265 s [15.265 s, 15.265 s] -
appsec 14.687 s [14.687 s, 14.687 s] -578.0 ms (-3.8%)
iast 18.422 s [18.422 s, 18.422 s] 3.157 s (20.7%)
iast_GLOBAL 18.194 s [18.194 s, 18.194 s] 2.929 s (19.2%)
profiling 15.462 s [15.462 s, 15.462 s] 197.0 ms (1.3%)
tracing 14.752 s [14.752 s, 14.752 s] -513.0 ms (-3.4%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.584 s [15.584 s, 15.584 s] -
appsec 14.739 s [14.739 s, 14.739 s] -845.0 ms (-5.4%)
iast 18.429 s [18.429 s, 18.429 s] 2.845 s (18.3%)
iast_GLOBAL 18.006 s [18.006 s, 18.006 s] 2.422 s (15.5%)
profiling 15.03 s [15.03 s, 15.03 s] -554.0 ms (-3.6%)
tracing 14.867 s [14.867 s, 14.867 s] -717.0 ms (-4.6%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~8d5c038a1d, baseline=1.62.0-SNAPSHOT~f064e18a6c
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.488 ms) : 1476, 1499
.   : milestone, 1488,
appsec (2.557 ms) : 2502, 2612
.   : milestone, 2557,
iast (2.278 ms) : 2210, 2347
.   : milestone, 2278,
iast_GLOBAL (2.32 ms) : 2251, 2389
.   : milestone, 2320,
profiling (2.106 ms) : 2052, 2161
.   : milestone, 2106,
tracing (2.094 ms) : 2041, 2148
.   : milestone, 2094,
section candidate
no_agent (1.493 ms) : 1481, 1504
.   : milestone, 1493,
appsec (3.806 ms) : 3584, 4027
.   : milestone, 3806,
iast (2.283 ms) : 2214, 2352
.   : milestone, 2283,
iast_GLOBAL (2.327 ms) : 2257, 2397
.   : milestone, 2327,
profiling (2.105 ms) : 2050, 2159
.   : milestone, 2105,
tracing (2.083 ms) : 2029, 2136
.   : milestone, 2083,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.488 ms [1.476 ms, 1.499 ms] -
appsec 2.557 ms [2.502 ms, 2.612 ms] 1.07 ms (71.9%)
iast 2.278 ms [2.21 ms, 2.347 ms] 790.678 µs (53.2%)
iast_GLOBAL 2.32 ms [2.251 ms, 2.389 ms] 832.672 µs (56.0%)
profiling 2.106 ms [2.052 ms, 2.161 ms] 618.859 µs (41.6%)
tracing 2.094 ms [2.041 ms, 2.148 ms] 606.648 µs (40.8%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.493 ms [1.481 ms, 1.504 ms] -
appsec 3.806 ms [3.584 ms, 4.027 ms] 2.313 ms (154.9%)
iast 2.283 ms [2.214 ms, 2.352 ms] 790.059 µs (52.9%)
iast_GLOBAL 2.327 ms [2.257 ms, 2.397 ms] 834.103 µs (55.9%)
profiling 2.105 ms [2.05 ms, 2.159 ms] 611.918 µs (41.0%)
tracing 2.083 ms [2.029 ms, 2.136 ms] 589.691 µs (39.5%)

@bm1549 bm1549 marked this pull request as draft April 10, 2026 19:23
@bm1549 bm1549 changed the title WIP - DO NOT REVIEW: Skip synchronized(unsafeTags) on owner-thread tag writes Skip synchronized(unsafeTags) on owner-thread tag writes Apr 10, 2026
Add three targeted concurrency tests that exercise the exact cross-thread
tag write pattern the JMH crossThread benchmark was measuring:
- crossThreadSustainedNoCrash: 8 threads × 10k setTag on same span
- ownerToSharedTransition: owner writes first, then 8 threads join
- manySpansCrossThread: 10k short-lived spans tagged from 8 threads

All pass, proving the production code handles cross-thread writes without
NPE or structural corruption.

Fix the crossThread benchmark: change SharedSpan @Setup from
Level.Invocation to Level.Iteration. With Level.Invocation, 8 threads
raced to call setup() concurrently, causing NPE when state.span was
transiently null between invocations — a benchmark harness bug, not a
production code bug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
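
The cross-thread write pattern these tests exercise can be sketched with the standard library alone. This is an illustration of the test shape only: `FakeSpan` is a hypothetical, fully synchronized stand-in for a span and does not use the tracer's classes or the PR's lock-skipping path.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class CrossThreadTagStress {
  /** Hypothetical span stand-in: every tag operation takes the same monitor. */
  static final class FakeSpan {
    private final Map<String, Object> tags = new HashMap<>();

    synchronized void setTag(String key, Object value) {
      tags.put(key, value);
    }

    synchronized int tagCount() {
      return tags.size();
    }
  }

  /** Starts all writers at once on one shared span and returns the final tag count. */
  static int run(int threads, int writesPerThread) {
    FakeSpan span = new FakeSpan();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    AtomicReference<Throwable> failure = new AtomicReference<>();
    try {
      for (int t = 0; t < threads; t++) {
        final int id = t;
        pool.execute(() -> {
          try {
            start.await(); // line all writers up so they contend from the first write
            for (int i = 0; i < writesPerThread; i++) {
              span.setTag("t" + id + ".k" + (i % 16), i); // 16 distinct keys per thread
            }
          } catch (Throwable e) {
            failure.compareAndSet(null, e); // remember the first failure from any writer
          }
        });
      }
      start.countDown();
      pool.shutdown();
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("writers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
    if (failure.get() != null) {
      throw new AssertionError(failure.get());
    }
    return span.tagCount();
  }

  public static void main(String[] args) {
    System.out.println("tags after stress run: " + run(8, 10_000));
  }
}
```

Passing a run like this shows the absence of crashes for the interleavings that happened to occur; it cannot rule out races that a particular run did not hit.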
Contributor

@dougqh dougqh left a comment

Looking at throughput numbers, there's a modest gain that might make this change worth the complexity if we can simplify it a bit.

The two main changes that I think we should try are...

  • replacing the two volatiles with a single volatile for the owningThread
  • introducing higher-order helper functions that hide the locking complexity

Then all the duplicate code can go away, and the accessing code becomes something like...
accessTags(unsafeTags -> {
...
});
We just need to make sure not to introduce a capturing lambda, so we don't incur unneeded allocation.
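
One way to read the suggestion above is a helper that receives its inputs as arguments, so the lambda body closes over nothing. The sketch below is an assumption about what such a helper could look like: `TagAccessSketch`, the `TagOp` interface, and the three-argument `accessTags` signature are hypothetical and are not taken from DDSpanContext.

```java
import java.util.HashMap;
import java.util.Map;

final class TagAccessSketch {
  /** Hypothetical functional interface: state arrives as arguments, so lambdas need not capture. */
  @FunctionalInterface
  interface TagOp<A, B, R> {
    R apply(Map<String, Object> tags, A a, B b);
  }

  private final Map<String, Object> unsafeTags = new HashMap<>();

  /** All locking lives here; callers only supply the operation and its inputs. */
  <A, B, R> R accessTags(TagOp<A, B, R> op, A a, B b) {
    synchronized (unsafeTags) {
      return op.apply(unsafeTags, a, b);
    }
  }

  Object setTag(String key, Object value) {
    // Non-capturing: key and value travel through accessTags instead of being closed over.
    return accessTags((tags, k, v) -> tags.put(k, v), key, value);
  }

  Object getTag(String key) {
    // The second input is unused for reads, so null is passed as a placeholder.
    return accessTags((tags, k, unused) -> tags.get(k), key, null);
  }

  public static void main(String[] args) {
    TagAccessSketch span = new TagAccessSketch();
    span.setTag("http.status_code", 200);
    System.out.println(span.getTag("http.status_code"));
  }
}
```

On HotSpot a lambda that captures nothing is typically instantiated once per call site, whereas a capturing lambda may allocate each time it is evaluated. That is implementation behavior rather than a language guarantee, so it would be worth confirming with an allocation profiler.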

@dougqh
Contributor

dougqh commented Apr 13, 2026

Looking at throughput numbers, there's a modest gain that might make this change worth the complexity if we can simplify it a bit.

The two main changes that I think we should try are...

  • replacing the two volatiles with a single volatile for the owningThread
  • introducing higher-order helper functions that hide the locking complexity

Then all the duplicate code can go away, and the accessing code becomes something like... accessTags(unsafeTags -> { ... }); We just need to make sure not to introduce a capturing lambda, so we don't incur unneeded allocation.

Alas, creating a higher-order helper function in DDSpanContext proved more annoying than anticipated.
Or at least, it is harder than I expected to do so while also avoiding closing over variables.

I'm back to thinking if we're going to do this change, we should do it in OptimizedTagMap. At least in OptimizedTagMap, most methods are sugar around a few methods that just work with a single TagMap.Entry.

Per review feedback, move the lock-skipping optimization from
DDSpanContext into OptimizedTagMap itself. This keeps the optimization
invisible to callers — DDSpanContext no longer needs synchronized blocks
around tag operations, and developers adding new tag operations don't
need to think about locking.

OptimizedTagMap now has a volatile Thread ownerThread field. Core methods
(getAndSet, getAndRemove, getEntry, putAll, forEach, copy, etc.) check
ownership: owner thread skips the lock, non-owner threads synchronize
and permanently transition to shared mode.

DDSpanContext changes: removed all 27 synchronized(unsafeTags) blocks,
added setOwnerThread(current) in constructor, transitionToShared()
delegates to TagMap.

Also adds @Threads(8) JMH benchmark variants and 5 new concurrency
tests (mixed read/write, fuzz, value consistency, finish race,
concurrent metrics).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@dougqh dougqh left a comment

Right now, this change creates a correctness issue when LegacyTagMap is being used.
As far as I can tell, this is a case where we've reached the limits of AI coding.

Before we consider such a change, I think we first need to phase out LegacyTagMap. That can be done in a separate pull request.

But even then, I think this introduces an error-prone, complicated ownership model for only a small gain in real user benchmarks. I don't think that's a good trade-off.

I think we might be able to use this PR as inspiration for a future change, but I think how it fits in needs to be thought through carefully.

bm1549 and others added 2 commits April 14, 2026 13:02
Address reviewer feedback on the thread-ownership optimization:

- Add TagMap.withLock(Runnable/Supplier) API for compound operations,
  replacing all synchronized(unsafeTags) in DDSpanContext with
  unsafeTags.withLock() — locking is now fully encapsulated in TagMap
- Add missing ownership checks to OptimizedTagMap: size(), isEmpty(),
  containsValue(), freeze(), isFrozen(), toString(), immutableCopy()
- Add revokeOwnership() helper to avoid redundant volatile writes on
  the cache line after transition to shared mode
- withLock() revokes ownership before running the operation, closing
  the TOCTOU race where the owner thread could bypass the compound lock
- Add self-synchronization to LegacyTagMap (synchronized on all key
  HashMap methods + withLock override) so it is safe without outer locks
- Add TagMapConcurrencyTest: owner fast path, transition visibility,
  post-transition contention, withLock atomicity, size consistency
- Expand DDSpanContextConcurrencyTest: compound atomicity for
  setSpanSamplingPriority, getTags snapshot consistency, transition +
  concurrent read safety

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SpotBugs flags *Impl methods in OptimizedTagMap for non-atomic access to
the `size` and `frozen` fields. These methods run either on the owner
thread (lock-free by design) or inside synchronized(this) via reentrant
calls. Also fix null-safe containsValue using Objects.equals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
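
The null-safe comparison mentioned in this commit can be illustrated in isolation. The sketch below is not the OptimizedTagMap code: `ContainsValueSketch` and its array-based storage are hypothetical, and only the use of `Objects.equals` comes from the commit message.

```java
import java.util.Objects;

final class ContainsValueSketch {
  /** Returns true if any stored value equals target; null values and a null target are allowed. */
  static boolean containsValue(Object[] values, Object target) {
    for (Object value : values) {
      // Objects.equals handles null on either side, unlike value.equals(target).
      if (Objects.equals(value, target)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Object[] values = {"GET", null, 200};
    System.out.println(containsValue(values, null));
    System.out.println(containsValue(values, "POST"));
  }
}
```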

Labels

comp: core Tracer core tag: ai generated Largely based on code generated by an AI or LLM type: enhancement Enhancements and improvements

2 participants